Towards Emerging Multimodal Cognitive Representations from Neural Self-Organization
Abstract
The integration of multisensory information plays a crucial role in autonomous robotics. In this work, we investigate how robust multimodal representations can naturally develop in a self-organized manner from co-occurring multisensory inputs. We propose a hierarchical learning architecture with growing self-organizing neural networks for learning human actions from audiovisual inputs. Associative links between unimodal representations are incrementally learned by a semi-supervised algorithm with bidirectional connectivity that takes into account the inherent spatiotemporal dynamics of the input. Experiments on a dataset of 10 full-body actions show that our architecture is able to learn action-word mappings without the need to segment training samples for ground-truth labelling. Instead, multimodal representations of actions are obtained using the co-activation of action features from video sequences and labels from automatic speech recognition. Promising experimental results encourage the extension of our architecture in several directions.
Keywords: human action recognition, multimodal integration, self-organizing networks.
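The co-activation idea in the abstract can be illustrated with a toy sketch. It is not the authors' architecture: it assumes a fixed set of prototype vectors in place of the growing self-organizing networks, ignores temporal dynamics, and uses plain co-occurrence counts as associative links. The class name `CoActivationAssociator` and all of its methods are hypothetical names chosen for illustration.

```python
from collections import defaultdict
import math


class CoActivationAssociator:
    """Toy associative memory linking visual prototype nodes to word labels.

    Assumption: prototypes are fixed stand-ins for trained network nodes;
    the growing networks described in the abstract are not modelled here.
    """

    def __init__(self, prototypes):
        self.prototypes = [tuple(p) for p in prototypes]
        # link_counts[node_index][word] = number of observed co-activations
        self.link_counts = defaultdict(lambda: defaultdict(int))

    def best_matching_node(self, features):
        # Index of the prototype closest to the input (Euclidean distance)
        return min(
            range(len(self.prototypes)),
            key=lambda i: math.dist(self.prototypes[i], features),
        )

    def observe(self, features, word=None):
        # Strengthen a node-word link only when a label co-occurs with the input
        node = self.best_matching_node(features)
        if word is not None:
            self.link_counts[node][word] += 1
        return node

    def word_for(self, features):
        # Action -> word: most frequently co-activated label of the winning node
        links = self.link_counts.get(self.best_matching_node(features))
        if not links:
            return None
        return max(links, key=links.get)

    def nodes_for(self, word):
        # Word -> action: nodes whose strongest link is the given word
        return sorted(
            node
            for node, links in self.link_counts.items()
            if links and max(links, key=links.get) == word
        )


assoc = CoActivationAssociator([(0.0, 0.0), (1.0, 1.0)])
assoc.observe((0.1, 0.0), "walk")   # labelled sample
assoc.observe((0.9, 1.1), "jump")   # labelled sample
assoc.observe((0.0, 0.2))           # unlabelled sample: no link update
print(assoc.word_for((0.2, 0.1)))   # walk
print(assoc.nodes_for("jump"))      # [1]
```

Because the links are stored per node and queried in both directions, unlabelled samples still activate a node without changing any association, which loosely mirrors the semi-supervised, bidirectional setting the abstract describes.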
Similar resources
Emergence of multimodal action representations from neural network self-organization
The integration of multisensory information plays a crucial role in autonomous robotics for forming robust and meaningful representations of the environment. In this work, we investigate how robust multimodal representations can naturally develop in a self-organizing manner from co-occurring multisensory inputs. We propose a hierarchical architecture with growing self-organizing neural networks ...
Neural Gas for Sequences
For unsupervised sequence processing, standard self-organizing maps (SOM) can be naturally extended by recurrent connections and explicit context representations. Known models are the temporal Kohonen map (TKM), recursive SOM, SOM for structured data (SOMSD), and HSOM for sequences (HSOM-S). We discuss and compare the capabilities of exemplary approaches to store different types of sequences. A...
Associative Neural Models for Biomimetic Multi-Modal Learning in a Mirror Neuron-based Robot
By using neurocognitive evidence on mirror neuron system concepts, the MirrorBot project has developed neural models for intelligent robot behaviour. These models employ diverse learning approaches such as reinforcement learning, self-organisation and associative learning to perform cognitive robotic operations such as language grounding in actions, object recognition, localisation and docking. ...
Towards the Acquisition of a Sensorimotor Vocal Tract Action Repository within a Neural Model of Speech Processing
While a mental lexicon stores phonological, grammatical and semantic features of words, a vocal tract action repository is assumed to store inner motor and sensory representations of speech items (i.e. the sounds, syllables and words) of the speaker’s native language. On the basis of a neural model of speech processing, which comprises important cognitive and sensorimotor aspects of speech prod...
Unsupervised Grounding of Spatial Relations
We present an unsupervised connectionist model for grounding color, shape and spatial relations of two objects in 2D space. The model constitutes a two-layer architecture that integrates information from visual and auditory inputs. The images are presented as the visual inputs to an artificial retina and five-word sentences describing them (e.g. “Red box above green circle”) serve as auditory in...
Publication date: 2015